Performance characterization of containerization for HPC workloads on InfiniBand clusters: an empirical study
Abstract
Containerization technology offers an appealing alternative for encapsulating and operating applications (and all their dependencies) without incurring the performance penalties of Virtual Machines. As a result, it has attracted the interest of the High-Performance Computing (HPC) community as a way to obtain fast, customized, portable, flexible, and reproducible deployments of its workloads. Previous work in this area demonstrated that containerized HPC applications can exploit InfiniBand networks, but ignored the potential of multi-container deployments, which partition the processes that belong to each application into multiple containers in each host. Partitioning HPC applications has proven to be useful with virtual machines by constraining them to a single NUMA (Non-Uniform Memory Access) domain. This paper conducts a systematic study of the performance of multi-container deployments with different network fabrics and protocols, focusing especially on InfiniBand networks. We analyze the impact of container granularity and its potential to exploit processor and memory affinity to improve applications' performance. Our results show that default Singularity can achieve near bare-metal performance but does not support fine-grain multi-container deployments. Docker and Singularity-instance behave similarly in terms of the performance of deployment schemes with different container granularity and affinity. This behavior differs across the several network fabrics and protocols, and depends as well on the application communication patterns and the message size. Moreover, deployments on InfiniBand are also more impacted by computation and memory allocation and, because of that, they can exploit affinity better.
Related resources
Optimized Broadcast for Deep Learning Workloads on Dense-GPU InfiniBand Clusters: MPI or NCCL?
Dense Multi-GPU systems have recently gained a lot of attention in the HPC arena. Traditionally, MPI runtimes have been primarily designed for clusters with a large number of nodes. However, with the advent of MPI+CUDA applications and CUDA-Aware MPI runtimes like MVAPICH2 and OpenMPI, it has become important to address efficient communication schemes for such dense Multi-GPU nodes. This couple...
A Comprehensive Performance Evaluation of OpenSHMEM Libraries on InfiniBand Clusters
OpenSHMEM is an open standard that brings together several long-standing vendor-specific SHMEM implementations and allows applications to use SHMEM in a platform-independent fashion. Several implementations of OpenSHMEM have become available on clusters interconnected by InfiniBand networks, which has gradually become the de facto high performance network interconnect standard. In this paper, w...
Empirical Performance Models for Java Workloads
Java is widely deployed on a variety of processor architectures. Consequently, an understanding of microarchitecture level Java performance is critical to optimize current systems and to aid design and development of future processor architectures for Java. Although this is facilitated by a rich set of processor performance counters featured on several contemporary processors, complex processor...
The Case For Colocation of HPC Workloads
The current state of practice in supercomputer resource allocation places jobs from different users on disjoint nodes both in terms of time and space. While this approach largely guarantees that jobs from different users do not degrade one another’s performance, it does so at high cost to system throughput and energy efficiency. This focused study presents job striping, a technique that signifi...
Network Performance in Distributed HPC Clusters
Linux-based clusters have become prevalent as a foundation for High Performance Computing (HPC) systems. As these clusters become more affordable and available, and with the emergence of high speed networks, it is becoming more feasible to create HPC grids consisting of multiple clusters. One of the attractions of such grids is the potential to scale applications across the various clusters. Th...
Journal
Journal title: Cluster Computing
Year: 2021
ISSN: ['1386-7857', '1573-7543']
DOI: https://doi.org/10.1007/s10586-021-03460-8